AITopics | distance matter

Distance Matters For Improving Performance Estimation Under Covariate Shift

Roschewitz, Mélanie, Glocker, Ben

arXiv.org Artificial Intelligence, Aug-14-2023

Performance estimation under covariate shift is a crucial component of safe AI model deployment, especially for sensitive use-cases. Recently, several solutions were proposed to tackle this problem, most leveraging model predictions or softmax confidence to derive accuracy estimates. However, under dataset shifts, confidence scores may become ill-calibrated if samples are too far from the training distribution. In this work, we show that taking into account distances of test samples to their expected training distribution can significantly improve performance estimation under covariate shift. Precisely, we introduce a "distance-check" to flag samples that lie too far from the expected distribution, to avoid relying on their untrustworthy model outputs in the accuracy estimation step. We demonstrate the effectiveness of this method on 13 image classification tasks, across a wide range of natural and synthetic distribution shifts and hundreds of models, with a median relative MAE improvement of 27% over the best baseline across all tasks, and SOTA performance on 10 out of 13 tasks. Our code is publicly available at https://github.com/melanibe/distance_matters_performance_estimation.
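
The "distance-check" idea in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say which distance, threshold, or base estimator the authors use, so everything here is an assumption made for illustration: the function names `knn_distance` and `estimate_accuracy` are hypothetical, distance is taken as the mean Euclidean distance to the k nearest training embeddings, the threshold is a quantile of that distance on in-distribution validation data, and the base estimator is plain average confidence. It is not the code from the linked repository.

```python
import numpy as np


def knn_distance(queries, reference, k=5):
    """Mean Euclidean distance from each query to its k nearest reference points."""
    diffs = queries[:, None, :] - reference[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)      # shape: (n_queries, n_reference)
    nearest = np.sort(dists, axis=1)[:, :k]     # k smallest distances per query
    return nearest.mean(axis=1)


def estimate_accuracy(test_conf, test_emb, train_emb, val_emb, quantile=0.99, k=5):
    """Average-confidence accuracy estimate with a distance check (illustrative).

    Assumption: a test sample is "too far" if its k-NN distance to the training
    embeddings exceeds the chosen quantile of the same distance measured on
    held-out in-distribution validation embeddings.
    """
    threshold = np.quantile(knn_distance(val_emb, train_emb, k), quantile)
    too_far = knn_distance(test_emb, train_emb, k) > threshold
    # Assumption: flagged samples are counted as likely errors (contribute 0);
    # all other samples contribute their softmax confidence.
    trusted_conf = np.where(too_far, 0.0, test_conf)
    return float(trusted_conf.mean()), too_far
```

Without the check, a confidently wrong prediction on a far-away sample would inflate the estimate; with it, that sample's confidence is ignored. Other base estimators could be swapped in for average confidence under the same flagging step.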

covariate shift, distance matter, performance estimation

arXiv.org Artificial Intelligence

2308.07223

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)

METCC: METric learning for Confounder Control Making distance matter in high dimensional biological analysis

Manghnani, Kabir, Drake, Adam, Wan, Nathan, Haque, Imran

arXiv.org Machine Learning, Dec-7-2018

High-dimensional data acquired from biological experiments such as next-generation sequencing are subject to a number of confounding effects. These effects include both technical effects, such as variation across batches from instrument noise or sample processing ("batch effects"), or institution-specific differences in sample acquisition and physical handling ("institutional variability"), as well as biological effects arising from true but irrelevant differences in the biology of each sample, such as age biases in diseases. Prior work has used linear methods to adjust for such batch effects. Here, we apply contrastive metric learning by a nonlinear triplet network to optimize the ability to distinguish biologically distinct sample classes in the presence of irrelevant technical and biological variation. Using whole-genome cell-free DNA data from 817 patients, we demonstrate that our approach, METric learning for Confounder Control (METCC), is able to match or exceed the classification performance achieved using a best-in-class linear method (HCP) or no normalization. Critically, results from METCC appear less confounded by irrelevant technical variables like institution and batch than those from other methods, even without access to the high-quality metadata required by many existing techniques, offering hope for improved generalization.
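
As a rough illustration of the triplet objective that a triplet network optimizes, here is a minimal Python sketch of the standard margin-based triplet loss. The abstract does not state the loss form, margin, distance, or triplet selection used in METCC, so the function name `triplet_loss`, the Euclidean distance, and `margin=1.0` are all assumptions. The anchor and positive are embeddings of samples from the same class; the negative comes from a different class.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Mean margin-based triplet loss over a batch of embedding triplets.

    Each argument has shape (batch, dim). The loss for a triplet is zero once
    the negative is at least `margin` farther from the anchor than the
    positive is (assumed Euclidean distance).
    """
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return float(np.maximum(0.0, d_pos - d_neg + margin).mean())
```

Minimizing this quantity over the parameters of the embedding network pulls same-class samples together and pushes different-class samples apart, regardless of which batch or institution they came from.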

artificial intelligence, machine learning, metcc, (16 more...)

arXiv.org Machine Learning

1812.03188

Country: North America > United States (0.47)

Genre: Research Report (0.55)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.32)
